ABSTRACT

X-ray surveys contain sizable numbers of star forming galaxies, beyond the AGN which usually make up the majority of detections.
Many methods to separate the two populations are used in the literature, based on X-ray and multiwavelength properties. We aim at a
detailed test of the classification schemes and at a study of the X-ray properties of the resulting samples.

We build on a sample of galaxies selected at 1.4 GHz in the VLA-COSMOS survey, classified by Smolčić et al. (2008) according to
their optical colours and observed with Chandra. A similarly selected control sample of AGN is also used for comparison. We review
some X-ray based classification criteria and check how they affect the sample composition. The efficiency of the classification scheme
devised by Smolčić et al. (2008) is such that ~ 30% of composite/misclassified objects are expected because of the higher X-ray
brightness of AGN with respect to galaxies. The latter fraction is actually 50% in the X-ray detected sources, while it is expected to be
much lower among X-ray undetected sources. Indeed, the analysis of the stacked spectrum of undetected sources shows, consistently,
strongly different properties between the AGN and galaxy samples. X-ray based selection criteria are then used to refine both samples.
The radio/X-ray luminosity correlation for star forming galaxies is found to hold with the same X-ray/radio ratio valid for nearby
galaxies. Some evolution of the ratio may be possible for sources at high redshift or high luminosity, though it is likely explained by
a bias arising from the radio selection. Finally, we discuss the X-ray number counts of star forming galaxies from the VLA- and
C-COSMOS surveys according to different selection criteria, and compare them to the similar determination from the Chandra Deep
Fields. The classification scheme proposed here may find application in future works and surveys.

Key words. X-rays: galaxies - radio continuum: galaxies - galaxies: fundamental parameters - galaxies: active - galaxies: high 
redshift 



1. Introduction

Radio and far-infrared observations have been widely accepted
as unbiased estimators of star formation (SF) in spiral galaxies
for decades (see the Condon 1992; Kennicutt 1998 reviews).
The X-ray domain has also been recognized as a SF
tracer in non-active galaxies (hereafter just "galaxies") thanks
to a number of works highlighting the presence of X-ray vs.
radio/infrared correlations (David et al. 1992; Grimm et al. 2003;
Ranalli et al. 2003, hereafter RCS03; Gilfanov et al. 2004a).
Strong absorption (i.e. with column densities > $10^{22}$ cm$^{-2}$) is
also rare among galaxies, making the X-ray domain scarcely
sensitive to extinction. Thus, an X-ray based Star Formation
Rate (SFR) indicator can be considered not biased by absorption
(RCS03). An interpretation framework, whose main idea
is the dominance of High-Mass X-ray Binaries among the
contributors to the X-ray luminosity of galaxies, has also been
developed (Gilfanov et al. 2004b; Persic & Rephaeli 2007) and
is currently the subject of further investigation. The observations
of deep fields, especially with Chandra, have prompted
the search for galaxies at high redshifts (Alexander et al.
2002b; Bauer et al. 2002; Hornschemeier et al. 2003; Ranalli
2003). The galaxies X-ray luminosity function and its evolution
have been investigated both in the local universe and
at high redshift (Georgantopoulos et al. 1999; Norman et al.
2004; Georgantopoulos et al. 2005; Ranalli et al. 2005, hereafter
RCS05; Georgakakis et al. 2007; Ptak et al. 2007; Lehmer et al.
2008) with also the goals of obtaining an absorption-free
estimate of the cosmic star formation history, and deriving the
contribution by galaxies to the X-ray background.

However, any work involving galaxies in X-ray surveys has
to deal with the fundamental fact that AGN are preferentially
selected in flux-limited X-ray surveys. A careful and efficient
classification of the detected objects is necessary to identify the
galaxies among the dominant AGN population. In early studies
(Maccacaro et al. 1988) AGN were found to populate a well
defined region of the X-ray/optical vs. X-ray flux plane, bounded
by an X-ray/optical flux ratio X/O = -1 (see definition in
Sect. 4.2). This threshold has been often adopted as an
approximate line dividing AGN and X-ray emitting galaxies. A more
robust separation between AGN and star forming galaxies is
obtained (Xue et al. 2011; Vattakunnel et al. 2012) by considering
several different criteria (X-ray hardness ratio, X-ray luminosity,
optical spectroscopy, X-ray/infrared or X-ray/radio flux ratios).
An analysis of the relative merits of the different criteria when 
taken separately, and of the most effective trade-offs to identify 
star-forming galaxies is one of the aims of this paper. 

A similar need of a careful object classification has arisen 
in deep radio observations. It has long been known
that a population of faint radio sources associated with faint
blue galaxies emerges at radio fluxes below ~ 1 mJy
(Windhorst et al. 1985; Fomalont et al. 1991; Richards et al.
1998; Richards 2000). In recent years, it has been shown that a
sizable fraction (about 50%) of this sub-mJy population is
actually made up of AGN (Gruppioni et al. 1999; Ciliegi et al. 2003;
Seymour et al. 2008; Smolčić et al. 2008; Padovani et al. 2009;
Strazzullo et al. 2010). This means that an accurate screening is
needed also for radio-selected faint galaxies.

This screening has been the subject of the work by
Smolčić et al. (2008, hereafter S08), who made use of the
extensive data sets of the COSMOS survey to analyze ~ 2400
sub-mJy radio sources and classified them according to a newly
developed, photometry-based method to separate SF galaxies
and AGN. Their method is based on optical rest-frame synthetic
colours, which are the result of a principal component analysis
of many combinations of narrow-band colours, and which
correlate with the position of the objects in the classical BPT diagram
(Baldwin et al. 1981; see Sect. 2 for details).

Here we build on this work, and use the S08 samples as 
the starting point for our classification of the X-ray galaxies in 
COSMOS. We intend to test the X-ray based selection criteria 
against the S08 method, and eventually refine the selection. 

The Cosmological Evolution Survey (COSMOS) is an all- 
wavelength survey, from radio to X-ray, designed to probe the 
formation and evolution of astronomical objects as a function 
of cosmic time and large scale structure environment in a field
of 2 deg$^2$ area (Scoville et al. 2007). In this paper, we build
mainly on the radio (VLA-COSMOS, Schinnerer et al. 2007),
X-ray (Chandra-COSMOS, or C-COSMOS, Elvis et al. 2009),
and optical spectroscopic (Z-COSMOS, Lilly et al. 2007)
observations. The radio data were taken at 1.4 GHz and have a RMS
noise of 7-10 $\mu$Jy (with the faintest sources discussed in this paper
having fluxes around 60 $\mu$Jy), while the X-ray data have a flux
limit of $1.9 \times 10^{-16}$ erg s$^{-1}$ cm$^{-2}$ in the 0.5-2 keV band. The area
considered here is that covered by Chandra, which is a fraction
(0.9 deg$^2$) of the whole COSMOS field. The XMM-Newton
observations (XMM-COSMOS, Cappelluti et al. 2009) covered the
whole field but with a brighter flux limit ($1.7 \times 10^{-15}$ erg s$^{-1}$ cm$^{-2}$
in the 0.5-2 keV band). We focus on C-COSMOS here because
its combination of area and limiting flux offers the best trade-off
for the subject of our study.

The structure of this paper is as follows. In Sect. 2 we define
a sample of galaxies based on radio and optical selection, and
subsequent X-ray detection. An AGN sample is selected with
the same method to allow for comparisons. In Sect. 3 we
characterize the sample, in terms of magnitudes, redshifts, and optical
spectra. In Sect. 4 we review some commonly used X-ray-based
indicators of star formation vs. AGN activity, and test them on
our sample of galaxies; on this basis, a refined sample is then
defined. In Sect. 5 we investigate the average properties of galaxies
with the same radio-optical selection but without a detection by
Chandra. In Sect. 6 we discuss the number of composite and
mis-classified sources. In Sect. 7 we consider if the COSMOS
data can further constrain the radio/X-ray correlation. Sect. 8 is
devoted to an analysis of the X-ray number counts of the
radio-selected COSMOS galaxies. Finally, in Sect. 9 we review our
conclusions.

The cosmological parameters assumed in this paper are $H_0 =
70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_\Lambda = 0.7$ and $\Omega_M = 0.3$.
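As an illustration of how these parameters enter the luminosities quoted later, the following minimal sketch integrates the luminosity distance numerically for a flat universe and converts a flux into a luminosity. The function names, the midpoint integration and the neglect of any K-correction are assumptions made here for illustration; they are not taken from the paper.

```python
import math

H0 = 70.0            # km/s/Mpc, as assumed in the text
OMEGA_M = 0.3
OMEGA_L = 0.7
C_KM_S = 299792.458  # speed of light in km/s
MPC_CM = 3.0857e24   # centimetres per megaparsec


def luminosity_distance_mpc(z, steps=10000):
    """Luminosity distance in Mpc for a flat Lambda-CDM universe (midpoint rule)."""
    dz = z / steps
    integral = 0.0
    for i in range(steps):
        zi = (i + 0.5) * dz
        integral += dz / math.sqrt(OMEGA_M * (1.0 + zi) ** 3 + OMEGA_L)
    return (1.0 + z) * (C_KM_S / H0) * integral


def luminosity_erg_s(flux_cgs, z):
    """Luminosity in erg/s from a flux in erg/s/cm^2 (no K-correction applied)."""
    d_cm = luminosity_distance_mpc(z) * MPC_CM
    return 4.0 * math.pi * d_cm ** 2 * flux_cgs
```

For example, `luminosity_erg_s(1.9e-16, 0.36)` gives the luminosity of a source at the C-COSMOS soft-band flux limit placed at the median redshift of the detected SF sample.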

2. Selection criteria 

The catalogue of the COSMOS radio sources was published
in Schinnerer et al. (2007); the objects in this catalogue were
then classified by S08, on the basis of their photometry-based
method. Only objects with redshift < 1.2 were considered by 
S08, because the errors on the classification would be larger be- 
yond this threshold. AGN can be broadly divided in two classes: 
objects where the AGN dominates the entire Spectral Energy 
Distribution (SED), i.e. mainly QSO, and objects where it does 
not, such as type-II QSO, low luminosity AGN (Seyfert and 
LINER galaxies), and absorption-line AGN. 

The fraction of type-I QSO in the VLA-COSMOS
catalogue is small (~ 5%). They have higher X-ray luminosities
than the SF and the other kinds of AGN, making the
few type-I objects easier to separate from other classes of
objects$^{1}$; since we are mainly interested in the properties of
galaxies, we will not discuss them further. Hereafter with the term
"AGN" we will only refer to the other kinds of AGN (type-II
and low-luminosity). These AGN have broad-band
properties similar to those of SF galaxies. The main tools to
disentangle SF galaxies and low-luminosity AGN are
spectroscopic diagnostic diagrams (BPT diagrams; Baldwin et al. 1981;
Veilleux & Osterbrock 1987; Kewley et al. 2001) which rely on
the [O III 5007]/H$\beta$ and [N II 6584]/H$\alpha$ line ratios. Their main
drawback is, however, the long telescope time needed to obtain
good-quality spectra. Alternative methods which only build on
photometric data can therefore be useful.

Based on the observation of a tight correlation between
rest-frame colours of emission-line galaxies and their position in the
BPT diagram, Smolčić et al. (2006) used the Sloan Digital Sky
Survey (SDSS) photometry (a modified Strömgren system) to i)
calculate rest-frame colours, and ii) use the Principal Component
Analysis (PCA) technique to identify, among all linear
combinations of colours, those which correlate best with the position in
the BPT diagram. One of these combinations, named P1, was
found to correlate strongly enough with the emission line
properties of SF galaxies and AGN to be used alone for the
classification.

By applying this method to the COSMOS multi-band photo- 
metric data, S08 produced a list of 340 'star forming' (hereafter 
SF) and 601 type II/dusty/low luminosity AGN candidates. The 
Chandra field, which is smaller than the radio-surveyed area, 
contains 242 SF and 398 AGN. Some mis-classifications are
inherent in any colour- or line-based method, and a fraction of
objects may also exhibit composite or intermediate properties.
Thus, S08 estimate that SF samples actually may contain ~ 20%
of AGN and ~ 10% of composite objects. Conversely, AGN
samples contain ~ 5% SF and ~ 15% composite.

$^{1}$ The number of Chandra-detected type-I QSO is 11, out of 33 which
are in the C-COSMOS field of view. The minimum 2-10 keV luminosity
of these type-I QSO is $2.6 \times 10^{43}$ erg s$^{-1}$, therefore all type-I QSO
would be selected as QSO/AGN (as opposed to galaxies) by the absolute
luminosity criterion of Sect. 4.3. More generally, the type-I and
type-II classes have different distributions of luminosities: the median
2-10 keV luminosities are $6 \times 10^{43}$ and $7 \times 10^{42}$ erg s$^{-1}$, respectively.
Thus, we regard a simple X-ray luminosity argument to be sufficient to
differentiate the S08 type-I QSO from the star forming galaxy and the
type-II populations.

Fig. 1. Number of matched sources in cross-correlating the
C-COSMOS catalogue (Elvis et al. 2009) with the S08 sample of
radio-selected star forming galaxies.

We matched the radio positions of the SF and AGN sources
with those of the C-COSMOS catalogue (Elvis et al. 2009;
Puccetti et al. 2009). In Fig. 1 we show the number of matched
SF sources for different matching radii: the number of matches
rises steeply from 0.1" to 0.5", and flattens for larger radii. To
adopt a threshold for the maximum separation between the radio 
and X-ray coordinates, we considered that in the C-COSMOS 
survey some areas have been observed by Chandra only at large 
off-axis angles. For these areas, the point spread function (PSF) 
is much broader than the on-axis value (0.5" FWHM), and this 
can also introduce errors in the determination of the source posi- 
tion. The position errors reported in the C-COSMOS catalogue 
are in fact larger than 1" for 221 sources out of 1761. Thus we 
considered all matches within 3", and visually inspected every 
match to check that the X-ray PSFs and the errors on the X- 
ray positions were wide enough to justify the larger threshold. 
Following this criterion, one match was excluded because the 
PSF in that position was much narrower than the distance be- 
tween the radio and X-ray positions. 
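The dependence of the number of matches on the matching radius can be sketched as follows. This is only an illustration of a positional cross-match under a small-angle approximation: the function names and the coordinates are hypothetical, and the visual inspection of the PSF described above is not reproduced.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcsec (small-angle approximation, inputs in degrees)."""
    dra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    ddec = dec1 - dec2
    return 3600.0 * math.hypot(dra, ddec)


def count_matches(radio, xray, radius_arcsec):
    """Number of radio sources with at least one X-ray source within the radius."""
    n = 0
    for ra_r, dec_r in radio:
        if any(separation_arcsec(ra_r, dec_r, ra_x, dec_x) <= radius_arcsec
               for ra_x, dec_x in xray):
            n += 1
    return n


# Hypothetical (RA, Dec) positions in degrees, for illustration only.
radio = [(150.1000, 2.2000), (150.2000, 2.3000), (150.3000, 2.1000)]
xray = [(150.1000, 2.2001), (150.2005, 2.3000), (150.5000, 2.5000)]

for radius in (0.1, 0.5, 1.0, 3.0):
    print(radius, count_matches(radio, xray, radius))
```

Running the loop over a grid of radii gives the kind of curve shown in Fig. 1; the radius at which it flattens suggests where true counterparts are exhausted and chance associations begin to dominate.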

The samples of radio selected, optically classified, X-ray
detected sources consist thus of 33 SF (~ 14% of the SF-classified
radio sources) and 82 AGN-type objects (~ 21% of the
AGN-classified radio sources). The breakdown of the sources
according to their detection in the 0.5-2 keV, 2-7 keV and 0.5-7 keV
bands is shown in Table 1. Note the presence of 10 SF candidates
lacking a detection in the soft band: this hints at the presence of
AGN-type objects in the SF sample, which will be discussed in
detail in the next sections.



3. Optical and radio properties 

The samples of X-ray detected vs. undetected sources exhibit
different properties, as shown both in the following tests and in
the cumulative distributions in Fig. 2. For the SF sample we find
that:



Table 1. Number of X-ray detected radio sources, according to 
their optical classification and X-ray band of detection (F: full, 
0.5-7 keV; S: soft, 0.5-2 keV; H: hard, 2-7 keV). 



- detected sources have brighter (observer-frame) R
magnitudes than undetected ones, at a confidence level of 99.96%
(according to a Wilcoxon-Mann-Whitney test; the median
magnitudes are R = 20.53 and 21.66 for the detected and
undetected, respectively);

- detected sources have brighter radio fluxes than undetected,
at a confidence level of 98.5% (median fluxes: $S_{1.4\,{\rm GHz}}$ =
0.148 and 0.124 mJy, respectively);

- detected sources have lower redshifts than undetected, at a
confidence level of 99.8% (median redshifts: z = 0.36 and
0.61, respectively);

- detected sources have lower radio luminosities than
undetected, at a confidence level of 97.8% (median luminosities:
S = $8.7 \times 10^{29}$ and $5.0 \times 10^{30}$ erg s$^{-1}$ Hz$^{-1}$, respectively).

This is in line with the expectation of the detected sources being 
closer to us, and the undetected ones probing a larger volume 
where more luminous yet less common objects can be found. 
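The comparisons above rest on the Wilcoxon-Mann-Whitney rank test. A minimal sketch of the test is given below, using the normal approximation to the distribution of the U statistic and no tie correction in the variance; both simplifications are assumptions made here, and the implementation actually used for the paper is not specified in the text.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Wilcoxon-Mann-Whitney test (normal approximation).

    Returns the U statistic of sample x and the two-sided p-value.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    # Assign average ranks to tied values.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = 0.5 * (i + j) + 1.0
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sigma_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_two_sided
```

With `x` and `y` holding, for instance, the R magnitudes of detected and undetected sources, the confidence level quoted in the text corresponds to `1 - p_two_sided`.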

The X-ray detected AGN sources also have brighter R mag- 
nitudes and radio fluxes than the undetected ones, but do not 
show any significant difference in the redshift distribution. The 
behaviour of the radio luminosity is reversed: the undetected 
AGN have lower radio luminosities than the detected ones. 



Optical spectra of X-ray detected sources 

Optical spectra are available for most of the sources with
an X-ray detection from several spectroscopic campaigns: the
ZCOSMOS project (Lilly et al. 2007, 2009), Magellan/IMACS
surveys, the Sloan Digital Sky Survey (SDSS) (Trump et al.
2007, 2009), and deeper observations$^{2}$ with Keck/DEIMOS and
VIMOS/VLT. A simple classification based on diagnostic
diagrams (Bongiorno et al. 2010), further checked by visual
inspection, has been used to determine the optical classifications
(Civano et al. 2012). Since the signal/noise ratio varies a lot
in the sample, some objects have noisy spectra which can only
be classified tentatively.

In the SF sample, 21 objects (out of 33) are classified as 
emission line galaxies; 9 are classified as AGN, and 3 have no 
spectral information or have spectra with low signal/noise ratios. 

In the AGN sample, 25 objects (out of 82) are classified as 
AGN; 30 as emission line galaxies; 7 as absorption line galax- 
ies, and 20 have no spectral information or spectra with low sig- 
nal/noise ratio. 

This partial overlap in the optical classification between the
two catalogues is expected (see also Sect. 6), because i) the two
phenomena of accretion and star formation are often present
together in the same object, ii) the areas covered by
different populations (SF and AGN) overlap in the diagnostic diagrams
used by S08, and iii) low-luminosity, narrow-line AGN and actively
star forming galaxies can be difficult to distinguish in noisy
spectra. The last point is particularly true at z > 0.4, where the H$\alpha$
line is not sampled by optical spectra and, therefore, the standard
BPT diagram ([O III]/H$\beta$ vs. [N II]/H$\alpha$; Baldwin et al. 1981)
cannot be used in the optical spectral classification. A
comparison of the observed fraction of mis-classifications and composite
objects with the expectations will be presented in Sect. 6.

$^{2}$ PIs: Capak, Kartaltepe, Salvato, Sanders, Scoville.

Fig. 2. Relative cumulative distributions of redshift, R magnitude and radio flux in the Chandra-detected and undetected, SF and
AGN galaxies subsamples.

In the following Section, we will try and characterize fur- 
ther the two samples on the basis of the X-ray properties of the 
sources, with the aim of improving the classification. 



4. X-ray characterization of the selected sources 

X-ray spectra of SF galaxies are rather complex, as they include
emission from hot gas, supernova remnants (thermal spectra)
and X-ray binaries (non-thermal, power-law spectrum), with the
thermal components being usually softer than the non-thermal
ones. A detailed description of the expected spectra of the
different components may be found in Persic & Rephaeli (2002). The
relative importance of the spectral components may vary;
however, in most cases the average flux ratio between the 0.5-2.0
keV and the 2.0-10 keV bands is the same that would be
obtained if the spectrum were a power law with spectral
index $\Gamma = 2.1$ and negligible absorption (RCS03; Lehmer et al.
2008). This does not imply the lack of absorption of
X-rays in SF galaxies: M82 and NGC3256 are notable examples
(RCS03; Ranalli et al. 2008). However, the spectral analysis of
23 SF galaxies in RCS03 did not find heavy absorption to be a
general property of that sample.

The fluxes of candidate SF objects used in this paper have
been recomputed from the counts by assuming the $\Gamma = 2.1$
spectrum, instead of the $\Gamma = 1.4$ used in Elvis et al. (2009).
Conversely, for AGN we have used the latter, harder spectrum.
In the following, we review a few common indicators of SF
vs. AGN activity, and apply them to the SF sample for further
screening.
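To make the role of the assumed slope concrete, the sketch below computes the ratio between the 0.5-2 keV and 2-10 keV energy fluxes of an unabsorbed power law with photon index 2.1 or 1.4. It is a purely analytic illustration with hypothetical function names; it ignores absorption and the instrument response, which a real counts-to-flux conversion would need.

```python
import math


def band_energy_flux(gamma, e_lo, e_hi):
    """Energy flux (arbitrary normalization) of N(E) ~ E**-gamma between e_lo and e_hi (keV)."""
    if abs(gamma - 2.0) < 1e-12:
        # The integral of E**(1 - gamma) is logarithmic when gamma = 2.
        return math.log(e_hi / e_lo)
    k = 2.0 - gamma
    return (e_hi ** k - e_lo ** k) / k


def soft_to_hard_ratio(gamma):
    """Flux(0.5-2 keV) / Flux(2-10 keV) for an unabsorbed power law."""
    return band_energy_flux(gamma, 0.5, 2.0) / band_energy_flux(gamma, 2.0, 10.0)


for gamma in (2.1, 1.4):
    print(gamma, round(soft_to_hard_ratio(gamma), 3))
```

A slope of 2.1 puts nearly equal energy flux in the two bands, whereas the harder slope of 1.4 puts roughly three times more flux in the hard band, which is why the choice of spectrum matters when converting counts to fluxes.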

4.1. Hardness ratio 

The hardness ratio (HR) is a simple measure of the spectral
shape, defined as HR = (H - S)/(H + S), where H and S are
the net (i.e. background-subtracted) counts in the hard and soft
bands respectively. It is most useful when the sources are too
faint for a proper spectral analysis. The error on the HR is
usually calculated by taking the uncertainties on source and
background counts according to the Gehrels (1986) approximation,
and applying the error propagation for Gaussian distributions.
However, it has been shown that for faint sources this approach
largely overestimates the errors on the HR (Park et al. 2006).
To obtain a more realistic estimate of the HR uncertainties, we
use the Bayesian approach described in Park et al. (2006). The
source and local background counts have been extracted from
the Chandra event files in soft (0.5-2 keV) and hard (2-7 keV)
observer-frame energy bands. For the sources in the SF and AGN
samples we show in Fig. 3 the median value of the HR posterior
distributions, and the 68% and 90% highest probability density
(HPD) intervals of the posterior distributions, vs. the redshifts.
The HR corresponding to eight model spectra (absorbed
power-laws) are also superimposed. These model HR have been
calculated with XSPEC, using average effective areas (see Sect. 5)
and assuming no background.
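For reference, the classical estimate mentioned above (Gehrels-type count errors followed by Gaussian error propagation) can be sketched as below. This is the approach that the text describes as overestimating the errors for faint sources; the Bayesian method of Park et al. (2006) actually adopted here is not reproduced. The function names are hypothetical, and the count error uses the common approximation 1 + sqrt(N + 0.75) for the upper 1-sigma uncertainty.

```python
import math


def gehrels_sigma(n):
    """Approximate upper 1-sigma error on n Poisson counts (Gehrels 1986)."""
    return 1.0 + math.sqrt(n + 0.75)


def hardness_ratio(h, s, sigma_h, sigma_s):
    """HR = (H - S) / (H + S) with first-order Gaussian error propagation.

    Uses dHR/dH = 2S / (H + S)**2 and dHR/dS = -2H / (H + S)**2.
    """
    total = h + s
    hr = (h - s) / total
    sigma_hr = 2.0 * math.sqrt((s * sigma_h) ** 2 + (h * sigma_s) ** 2) / total ** 2
    return hr, sigma_hr


# Hypothetical net counts, for illustration only.
hr, err = hardness_ratio(10.0, 30.0, gehrels_sigma(10.0), gehrels_sigma(30.0))
```

For a source with only a few tens of counts the propagated error is already a sizable fraction of the allowed HR range (-1 to 1), which illustrates why a posterior-based interval is preferable for faint sources.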

About half of the SF sources detected in C-COSMOS have
a HR which is consistent with very flat or absorbed spectra
($N_{\rm H} \gtrsim 10^{21.5}$ cm$^{-2}$, or equivalently $\Gamma < 1.2$ without absorption).
Conversely, about half of the AGN sources have a HR consistent
with spectra steeper or less absorbed than the above thresholds.

The average hardness ratio of galaxies is the one given by
a non-absorbed $\Gamma = 2.1$ spectrum (RCS03). Harder spectra are
sometimes indeed found in galaxies (e.g., see the slopes for
single power-law fits in Dahlem et al. 1998) where they can result
from bright X-ray binaries (Persic & Rephaeli 2002). However,
Swartz et al. (2004) found that the average slopes of X-ray
binaries in nearby galaxies are $\Gamma = 1.88 \pm 0.06$ ($1.97 \pm 0.11$) for
binaries with luminosity larger (smaller) than $10^{39}$ erg s$^{-1}$,
respectively. Only 7% (8%) of the binaries studied by Swartz et al.
(2004) have slopes $\Gamma < 1.4$.

The large number of hard objects in the SF sample is
probably due to a selection effect. Because of the C-COSMOS flux
limit, the Chandra-detected SF sample only contains the
brightest 14% of all the SF objects in the field of view. In addition, 30%
of the S08 SF sources are expected to be composite or
misclassified, as is inherent in any selection method based on diagnostic
diagrams. Since AGN are on average brighter than galaxies, the
composite/misclassified objects should mingle with the brightest
galaxies, hence with the Chandra-detected sample rather than
with the Chandra-undetected one.
Fig. 3. Hardness ratios for the SF (left) and AGN (right) X-ray detected sources. The data points, boxes and whiskers show the
median and the 68% and 90% credible intervals for the HR, respectively. The superimposed lines show the HR referring to model
spectra: absorbed power-laws with two different slopes ($\Gamma = 2.1$, solid lines, and $\Gamma = 1.4$, dotted lines) and five column densities
($N_{\rm H} = 10^{20}$, $10^{21}$, $10^{21.5}$, $10^{22}$ and $10^{22.5}$ cm$^{-2}$ from bottom to top).



4.2. X-ray/optical flux ratio 

A fast and widely used, yet coarse method to classify X-ray
objects is to look at their X-ray/optical flux ratio X/O, defined as

$X/O = \log(F_{\rm X}) + 0.4\,R + 5.71$   (1)

where $F_{\rm X}$ is the 0.5-2 keV flux, and R is the optical apparent
magnitude in the R filter. On average, AGN have higher X/O
values than SF galaxies. Given the intrinsic dispersion in the
X/O values for both AGN and SF galaxies, no single X/O value
can be used to unambiguously separate AGN from SF galaxies.
However, it has been shown (Schmidt et al. 1998; Bauer et al.
2004; Alexander et al. 2002) that the value X/O = -1 can be
taken as a rough boundary between objects powered by star
formation (X/O < -1) and by nuclear activity (X/O > -1). In the
SF sample, 22 objects have X/O < -1 (2/3 of the total sample),
while 11 have X/O > -1. Of these latter 11 objects, which
this criterion would classify as AGN, 10 have a HR whose 68%
HPD interval is consistent with a column density $N_{\rm H} \gtrsim 10^{21.5}$
cm$^{-2}$.

For comparison, the AGN sample has only 1/3 of the objects
with X/O < -1 (27 objects out of 82).
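A minimal sketch of the X/O criterion, assuming the definition X/O = log10(F_X) + 0.4 R + 5.71 with the 0.5-2 keV flux in erg s^-1 cm^-2, is given below. The function names and the example values are hypothetical.

```python
import math


def x_over_o(flux_soft, r_mag):
    """X/O = log10(F_X) + 0.4 R + 5.71, with F_X the 0.5-2 keV flux in erg/s/cm^2."""
    return math.log10(flux_soft) + 0.4 * r_mag + 5.71


def is_sf_like(flux_soft, r_mag, threshold=-1.0):
    """True if the source lies on the star-formation side of the X/O boundary."""
    return x_over_o(flux_soft, r_mag) < threshold


# Hypothetical sources: the same X-ray flux with two different R magnitudes.
print(x_over_o(1e-15, 20.0), is_sf_like(1e-15, 20.0))
print(x_over_o(1e-15, 22.0), is_sf_like(1e-15, 22.0))
```

At fixed X-ray flux, an optically brighter counterpart (smaller R) lowers X/O, so the same flux can fall on either side of the boundary depending on the optical magnitude.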

4.3. X-ray luminosity 

An X-ray luminosity of $10^{42}$ erg s$^{-1}$ is also often used as another
rough boundary between SF galaxies and AGN, with the band in
which the luminosity is measured varying among different
authors. While this criterion is, on its own, so unrefined that it
ignores the existence of low luminosity AGN, it may still be of
use when considered along with other criteria. In the SF
sample, 17 objects have a 2-10 keV luminosity greater than this
limit. However, it is expected that the X-ray luminosity of SF
galaxies evolves with the redshift; RCS05 found that a pure
luminosity evolution of the form $L_{\rm X} \propto (1 + z)^{2.7}$ is a good
description of the available data (see also Norman et al. 2004). Thus
one should consider as AGN candidates only the objects with
$L_{\rm X} > 10^{42} (1 + z)^{2.7}$ erg s$^{-1}$: 12 objects in the SF sample satisfy
this criterion. All these 12 objects have a HR compatible with a
column density larger than $10^{21.5}$ cm$^{-2}$.
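The redshift-dependent luminosity threshold can be sketched as follows, assuming the pure luminosity evolution index is 2.7 and the local boundary is 10^42 erg s^-1. The function names and example values are hypothetical.

```python
def agn_luminosity_threshold(z, l0=1e42, index=2.7):
    """Evolving boundary L0 * (1 + z)**index, in erg/s (index assumed to be 2.7)."""
    return l0 * (1.0 + z) ** index


def is_agn_candidate(l_hard, z):
    """True if the 2-10 keV luminosity exceeds the evolving boundary."""
    return l_hard > agn_luminosity_threshold(z)


# A hypothetical source with L(2-10 keV) = 3e42 erg/s at two redshifts.
print(is_agn_candidate(3e42, 0.1))
print(is_agn_candidate(3e42, 1.0))
```

At z = 1 the boundary is about 6.5 times higher than the local value, so a source that would count as an AGN candidate nearby may remain compatible with star formation at high redshift. This is consistent with the drop from 17 to 12 candidates reported in the text.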

4.4. Off-nuclear sources 

If the X-ray position is not coincident with the galaxy centre, 
but is still within the area covered by the galaxy in the optical 
band, then the X-ray emission is probably due to an off-nuclear 
X-ray binary. Thus any contribution from nuclear accretion is 
unlikely to be significant. An off-nuclear flag is present in the
catalogue of optical identifications (Civano et al. 2012; see
also Mainieri et al. 2010). The only SF source classified as
off-nuclear is CXOC J100058.6+021139. No source from the AGN
important selection criterion, in practice it applies to only one 
object in our samples, and therefore we will not consider it any 
further. 



4.5. Classification and catalogue of sources 

From the considerations made above, it seems that only about
half of the objects in the SF sample have properties compatible
with SF-powered X-ray emission. This hints at the presence of
several objects in the sample which show intermediate, rather
than SF or AGN properties. Some refinement of the S08 SF
selection criteria thus seems possible by inspecting the X-ray
properties of the sources.

We consider the following conditions as indicators of SF ori- 
gin of the X-ray luminosity: 
- $L_{2-10} \le 10^{42} (1 + z)^{2.7}$ erg s$^{-1}$, where $L_{2-10}$ is the hard X-ray
(2.0-10 keV) luminosity;

- HR lower (softer) than what is expected for an absorbed
power-law with $\Gamma = 1.4$ and $N_{\rm H} = 10^{22}$ cm$^{-2}$;

- X/O < -1; 

- classification as galaxy from optical spectroscopy. 

For the purposes of classification, we only consider
optical spectra with clear AGN-like emission line ratios as
non-matching. This prevents low signal/noise spectra from influencing
the classification.

Among the many possible ways to combine the above criteria, we
use the following method: the number of matched criteria is
counted, and an object is classified accordingly. An object is
assigned to class 1, if it fulfils all the conditions; to class 2, if it
fulfils all conditions but one; to class 3, if there are at least two
conditions not satisfied. Objects for which one condition cannot
be checked (e.g., a missing optical spectrum), are classified as if
that condition had been matched. The idea is that the status of an
object is affected only by conditions which have been checked and not
matched.
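The counting rule described above can be written compactly. In the sketch below each criterion is passed as True (matched), False (checked and not matched) or None (cannot be checked); this three-valued encoding and the function name are assumptions made for illustration.

```python
def classify(lum_ok, hr_ok, xo_ok, spec_ok):
    """Return class 1, 2 or 3 from the four SF criteria.

    Only criteria that were checked and failed (False) count against the
    object; criteria that could not be checked (None) are treated as matched.
    """
    failed = sum(1 for c in (lum_ok, hr_ok, xo_ok, spec_ok) if c is False)
    if failed == 0:
        return 1
    if failed == 1:
        return 2
    return 3


# Hypothetical objects: no optical spectrum, one failed test, two failed tests.
print(classify(True, True, True, None))
print(classify(True, False, True, True))
print(classify(False, False, True, None))
```

Treating unchecked criteria as matched keeps objects with incomplete data in the sample, at the price of possibly retaining some contaminants in classes 1 and 2.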

Thus we recognize two samples of SF galaxies: one more 
conservative (class 1 objects), and another less conservative (ob- 
jects in classes 1 or 2). Class 3 objects probably do not have the 
majority of their X-ray emission powered by star formation re- 
lated processes. There are 8 class-1 objects; 8 class-2; and 17 
class-3 objects in the SF sample. 

If the same selection is applied to the AGN sample, we find 
14 class-1; 6 class-2; and 62 class-3. 

The classifications for both the SF and AGN sample are
reported in Tables 2 and 3 along with the other parameters of
interest: X-ray fluxes, luminosities, medians of the HR
posterior probability distributions, X/O ratios, X-ray/radio ratios (see
Sect. 7), and classification from optical spectroscopy.

5. Average properties of undetected objects 

The method of stacking analysis allows one to determine the
average properties of objects which are not individually detected; it
can be briefly described as follows. Candidate objects for stack- 
ing are selected from the list of SF sources in S08 which are 
not detected in C-COSMOS, with the additional criterion that no 
detected C-COSMOS source should be present within 7 arcsec 
from the position of the candidate. The reason is to avoid con- 
tamination from X-ray brighter sources. This does not introduce 
any bias in the selection of sources, because very few sources 
are excluded in this way (only 6 out of 209). 

Because of the radio-X-ray correlation, it is expected that the 
average X-ray properties are dominated by the brightest radio 
sources. Thus it may be advisable to include in the stack only 
sources with a narrow spread in their radio flux, to avoid biases 
due to the brightest sources. We split the sample on the basis of 
the radio flux, dividing the candidates in two lists as follows: 

1. 156 SF sources with $S_{1.4\,{\rm GHz}} \le 0.20$ mJy;

2. 43 SF sources with $0.20 < S_{1.4\,{\rm GHz}} < 0.63$ mJy.

Each sub-sample is 0.5 dex wide in flux; one starts from the 
lowest radio flux, while the other one follows continuously. 

For each list of candidates, postage-stamp size images
measuring 20 × 20 pixels (each one 0.491" wide) around the position
of every candidate are extracted and summed; the latter sum is
hereafter called "stacked image". If most candidates contribute
some X-ray photons, then a "stacked source" appears on top of
the background in the centre of the stacked image (similar results
can be obtained using the software described in Miyaji et al.
2008). The wavdetect tool is then used to determine the net
counts of the stacked source. This analysis was done for both
the 0.5-2.0 keV and 2.0-7.0 keV bands.
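The extraction and summation of postage stamps can be sketched in a few lines. The version below works on a counts image stored as a list of rows and on integer pixel positions; the function names are hypothetical, stamps falling off the image edge are simply skipped (an assumption), and the source detection step performed with wavdetect is not reproduced.

```python
def extract_stamp(image, x_c, y_c, size=20):
    """Cut a size x size stamp around pixel (x_c, y_c); None if it leaves the image."""
    half = size // 2
    x0, y0 = x_c - half, y_c - half
    if x0 < 0 or y0 < 0 or y0 + size > len(image) or x0 + size > len(image[0]):
        return None
    return [row[x0:x0 + size] for row in image[y0:y0 + size]]


def stack(image, positions, size=20):
    """Sum the stamps around all positions; returns (stacked image, stamps used)."""
    stacked = [[0] * size for _ in range(size)]
    used = 0
    for x_c, y_c in positions:
        stamp = extract_stamp(image, x_c, y_c, size)
        if stamp is None:
            continue
        used += 1
        for j in range(size):
            for i in range(size):
                stacked[j][i] += stamp[j][i]
    return stacked, used


# Hypothetical 100 x 100 counts image with one photon at each of two positions.
image = [[0] * 100 for _ in range(100)]
image[30][30] = 1   # indexed as image[y][x]
image[70][60] = 1
stacked, used = stack(image, [(30, 30), (60, 70), (5, 5)])
```

In this toy case the two photons land on the same central pixel of the stacked image, while the third position is discarded because its stamp would extend beyond the image.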

The stacked source was successfully detected by wavdetect
for the low-radio-flux sub-sample in both bands, and for the
high-radio-flux one in the soft band only. This is in line with the
expectation that a lower number of counts should be present in the
high- than in the low-radio-flux subsample: given the average fluxes
and the number of positions (Table 4), ~ 40% more counts are
expected in the latter than in the former.

The significance of the detection of a stacked source is best
estimated with simulations: we draw as many random positions
as the number of sources in the list, reproducing the same spatial
distribution of the VLA-COSMOS sources, and build a stacked
image from these positions. Total counts ($c_{\rm sim}$) within a
radius of 3.5" from the centre are extracted; the wavdetect software
is run on the stacked image; this cycle is repeated 10000 times
for each sub-sample and each band. We then define the following
p-values:

- $p_{\rm detect}$ as the fraction of times that the wavdetect software
finds a source within 1.1 arcsec of the centre of the stacked
image;

- $p_{\rm cts}$ as the fraction of times that $c_{\rm sim} > c_{\rm stack}$,
where $c_{\rm stack}$ are the total counts of the 'real' stacked source.

We identify the p-values as two estimates of the probability that
the stacked source was actually a background fluctuation. The
p-values are shown in Table 4 (actually, 1 - p is shown, i.e. the
probability that the source is not a fluctuation).
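
The counts-based p-value can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the background is represented by a list of photon positions in arcsec, the random stacking positions are drawn uniformly (the actual analysis reproduces the spatial distribution of the VLA-COSMOS sources), and the wavdetect step is omitted. The names stacked_counts and p_counts, and all toy numbers, are hypothetical.

```python
import random


def stacked_counts(photons, positions, radius):
    """Photon counts within `radius` of each position, summed over positions."""
    r2 = radius * radius
    total = 0
    for cx, cy in positions:
        total += sum(1 for x, y in photons
                     if (x - cx) ** 2 + (y - cy) ** 2 <= r2)
    return total


def p_counts(photons, n_positions, c_stack, radius, field_size, n_trials, rng):
    """Fraction of random stacks whose counts exceed those of the real stack."""
    exceed = 0
    for _ in range(n_trials):
        # Assumption: uniform random positions inside a square field.
        positions = [(rng.uniform(0, field_size), rng.uniform(0, field_size))
                     for _ in range(n_positions)]
        if stacked_counts(photons, positions, radius) > c_stack:
            exceed += 1
    return exceed / n_trials


# Toy background: 500 photons scattered over a 100" x 100" field.
rng = random.Random(42)
photons = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(500)]
p_cts = p_counts(photons, n_positions=10, c_stack=40, radius=3.5,
                 field_size=100.0, n_trials=200, rng=rng)
print(f"p_cts = {p_cts:.3f}, 1 - p_cts = {1 - p_cts:.3f}")
```

The number of trials is kept small here for speed; the text quotes 10000 repetitions for each sub-sample and band.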

Using the source and background regions defined above, and
a power-law average spectrum with $\Gamma = 2.1$, we extracted the net
counts and derived the fluxes and luminosities shown in Table 4.
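
As noted in the caption of Table 4, a photon index of 2.1 makes the 0.5-2 and 2-10 keV fluxes approximately equal. A short Python check of this property is sketched below, assuming an unabsorbed power law in photon flux, N(E) proportional to E^(-Gamma), and an arbitrary normalisation; the helper name band_energy_flux is hypothetical.

```python
def band_energy_flux(gamma, e_lo, e_hi):
    """Energy flux (arbitrary units) of a photon power law N(E) ~ E**-gamma.

    Integral of E * E**-gamma between e_lo and e_hi; valid for gamma != 2.
    """
    k = 2.0 - gamma
    return (e_hi ** k - e_lo ** k) / k


soft = band_energy_flux(2.1, 0.5, 2.0)   # 0.5-2 keV
hard = band_energy_flux(2.1, 2.0, 10.0)  # 2-10 keV
print(f"hard/soft flux ratio: {hard / soft:.3f}")
```

The ratio is very close to unity, so fluxes rounded to two significant digits coincide in the two bands.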

Stacked X-ray spectra have also been extracted for the two 
subsamples, using CIAO 4.0. Background spectra have been ex- 
tracted around the source positions, by taking 4 circular back- 
ground regions for each source, each background region having 
the same radius of the source region, and being placed 10" east, 
north, west, or south of the source. This ensures that the back- 
ground is the most accurate, given the actual sky positions of 
the sources. Then, we removed background positions which fell
within 7" of any Chandra-detected source. Response matrices
have been calculated by treating the stacked source as if it
were an extended source consisting of many small pieces
scattered around the detector area, weighted by the photons
actually present in each position.
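
The placement and filtering of the background regions can be sketched as follows. The sketch assumes flat-sky (x, y) coordinates in arcsec, with x increasing to the east and y to the north; the function name background_positions and the toy coordinates are hypothetical.

```python
import math


def background_positions(sources, detected, offset=10.0, exclusion=7.0):
    """For each source, return the background-region centres that survive
    the exclusion cut around X-ray detected sources.

    All coordinates are (x, y) in arcsec on a local tangent plane.
    """
    shifts = [(offset, 0.0), (0.0, offset), (-offset, 0.0), (0.0, -offset)]  # E, N, W, S
    result = []
    for sx, sy in sources:
        kept = []
        for dx, dy in shifts:
            bx, by = sx + dx, sy + dy
            # Keep the region only if no detected source lies within `exclusion`.
            if all(math.hypot(bx - qx, by - qy) >= exclusion for qx, qy in detected):
                kept.append((bx, by))
        result.append(kept)
    return result


# Toy case: one stacking position, one detected source 12" to the east.
regions = background_positions([(0.0, 0.0)], [(12.0, 0.0)])
print(regions)
```

In the toy case the eastern region lies 2" from the detected source and is dropped, while the other three are kept.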

The stacked spectra were fitted with a model which is the 
weighted sum of many absorbed power-laws; the number of 
power-law components is the number of sources in the stack. 
Each absorbed power-law is redshifted to the z of the corre- 
sponding source. The slope and absorption are free parameters, 
but are assumed to be the same for all sources. The weights are 
proportional to the radio fluxes. Finally, observer-frame
absorption due to the Galaxy is added. This method allows us to
fully account for the redshift distribution of the sources. We used
XSPEC version 11.3.2 for this analysis.
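
The structure of the stacked model can be illustrated with a short Python sketch. It is only meant to show how the redshifted components are combined: the absorption term uses a crude power-law stand-in for the photoelectric cross-section (an assumption made here for illustration, not the XSPEC absorption model), Galactic absorption is omitted, and the overall normalisation is arbitrary. The function names are hypothetical.

```python
import math


def toy_absorption(energy_kev, nh_1e22):
    """Transmission exp(-tau) with a toy optical depth tau ~ NH * E**-2.5.

    The scaling below is an assumed rough approximation for illustration only.
    """
    tau = 2.4 * nh_1e22 * energy_kev ** -2.5
    return math.exp(-tau)


def stacked_model(energy_kev, gamma, nh_1e22, redshifts, radio_fluxes):
    """Observed-frame spectral shape of the stack at one energy.

    One absorbed power law per source, evaluated at the rest-frame energy
    E * (1 + z), with a common slope and column density, and weights
    proportional to the radio fluxes.
    """
    total_weight = sum(radio_fluxes)
    value = 0.0
    for z, s_radio in zip(redshifts, radio_fluxes):
        e_rest = energy_kev * (1.0 + z)
        weight = s_radio / total_weight
        value += weight * toy_absorption(e_rest, nh_1e22) * e_rest ** -gamma
    return value


# Toy stack of three sources (redshifts and radio fluxes in mJy are made up).
z_list = [0.2, 0.5, 1.0]
s_list = [0.10, 0.15, 0.08]
for energy in (0.5, 1.0, 2.0, 5.0):
    print(energy, stacked_model(energy, 2.0, 0.5, z_list, s_list))
```

Because each component is shifted by its own redshift, the same intrinsic absorption cuts off the summed spectrum at different observed energies, which is why the redshift distribution has to be carried through the fit.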

The confidence contours for the parameters are shown in
Fig. 4 (solid curves) for the bin with sources with
$S_{1.4\,\rm GHz} \le 0.2$ mJy. The spectrum is consistent with
moderately steep photon indices ($1.5 < \Gamma < 2.5$). This behaviour
is expected for star forming galaxies (see Sect. 4). The bin with
sources with $0.2 < S_{1.4\,\rm GHz} \le 0.63$ mJy (not shown) has a
lower number of X-ray photons in the spectrum and thus a wider range
of allowed slopes, yet the spectrum is still consistent with the
other bin.

Table 4. Average properties of radio-selected SF-candidates without an X-ray detection in C-COSMOS. Radio fluxes in mJy; X-ray
fluxes in $10^{-18}$ erg s$^{-1}$ cm$^{-2}$; X-ray luminosities (rest frame) in $10^{40}$ erg s$^{-1}$. The counts are reported with their 68.3% error intervals.
The fluxes have been calculated by fitting the stacked spectra with an imposed $\Gamma = 2.1$ (this slope has the characteristic of making
the 0.5-2 and 2-10 keV fluxes approximately the same; considering the rounding of non-significant digits, this explains why the
fluxes are the same in the two bands). All values for the soft band are relative to the 0.5-2 keV interval, while for the hard band
counts are in the 2-7 keV, and fluxes and luminosities are in the 2-10 keV bands. The allowed spectral slope ($\Gamma$) range is given in
the last column.

Fig. 4. Confidence contours for relevant parameters of simple
models of stacked spectra of SF sources and (for comparison)
AGN sources not detected by Chandra (sources with radio flux
F < 0.20 mJy). The contours are shown at levels of $\Delta C$: +2.92,
+5.63, +8.34 above minima. Solid lines: SF sources; dotted
lines: AGN.



For comparison, in Fig. 4 (dotted curves) we also show the
confidence contours for stacked spectra of 207 AGN, selected
from the S08 sample in the same radio flux intervals as the SF
galaxies, and also not detected by Chandra. The average redshifts
of these AGN are not significantly different from those of the SF
galaxies. The AGN X-ray spectra are flatter; this is probably the
result of absorption and of redshift (summing absorbed spectra
of sources at different redshifts leads to flat or inverted spectra,
as in the case of the cosmic X-ray background). While a
single power-law model is probably simplistic, a more detailed
modeling would be beyond the scope of this paper. The average
X-ray fluxes for the AGN, calculated with the best-fit slopes, are
$7.2 \times 10^{-18}$ ($4.8 \times 10^{-17}$) erg s$^{-1}$ cm$^{-2}$ for the
bin with $S_{1.4\,\rm GHz} < 0.2$ mJy and $3.0 \times 10^{-18}$
($1.0 \times 10^{-17}$) erg s$^{-1}$ cm$^{-2}$ for the bin with
$0.2 < S_{1.4\,\rm GHz} < 0.63$ mJy in the 0.5-2 (2-10) keV band. The
allowed ranges for the AGN spectral slopes are [0.4-0.9] and
[0.6-2.2] for the $S_{1.4\,\rm GHz} < 0.2$ mJy and
$0.2 < S_{1.4\,\rm GHz} < 0.63$ mJy bins, respectively.

Fig. 5. Radio vs. X-ray (0.5-2 keV) fluxes for the VLA-COSMOS
SF sources detected in C-COSMOS. Detected
sources are marked with squares and error bars. The black
squares with an attached arrow mark the sources which are not
detected in the band to which the panel refers, but are detected in
any other of the C-COSMOS bands. Conversely, the grey upper
limits show the sources without a detection in any X-ray band,
hence without an entry in the C-COSMOS catalogue. The solid
line shows the RCS03 relationship, K-corrected to the average
redshift of the detected sources (z = 0.46), while the dashed
lines show the 1x and 3x standard deviation of the relationship.



6. Mis-classified and composite objects 

The fractions of X-ray detected objects whose classification
based on optical spectral line ratios (where available) is
different from that based on the synthetic P1 colour in S08 are
9/33 ~ 27% (SF sample) and 37/82 ~ 45% (AGN sample).
These numbers are of the same order as those quoted in
Sects. 4.1, 4.2, and 4.3, though it is important to stress that
different criteria yield different mis-classified and composite
(hereafter MCC) objects.

These fractions should be compared with the number of class 
3 objects in the SF sample (17/33 ~ 51%) and of classes 1-2 in 
the AGN sample (20/82 ~ 24%). 

However, these fractions do not take into account the large 
number of X-ray undetected objects and should probably be con- 
sidered as upper limits to the true fractions of MCC objects. In 
fact, the stark difference found between the average X-ray spectra
of undetected SF and AGN sources (Sect. 5) would rather suggest
much lower fractions of MCC. A lower limit to the true
fractions of MCC can be derived by assuming that the MCC 
are only present among the X-ray detected sources. In this case, 
the fractions would be 17/242 ~ 7% (20/398 ~ 5%) for the 
class 3 (classes 1-2) in the SF (AGN) sample. The rationale for 
this assumption would be, for the SF sample, that galaxies with 
an AGN contribution are on average brighter in X-rays than the 
galaxies without and thus are more likely to be X-ray detected. 
For the AGN sample, it would be that intense star formation 
could cause X-ray emission at a level similar to that of a low- 
luminosity or absorbed active nucleus. 
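
The fractions quoted in this Section follow from simple ratios of the object counts given in the text; the short Python check below only reproduces that arithmetic (the labels and the helper name percent are chosen here for illustration).

```python
def percent(k, n):
    """Fraction k/n expressed as a percentage."""
    return 100.0 * k / n


fractions = {
    "SF, spectroscopy vs colour (X-ray detected)": (9, 33),
    "AGN, spectroscopy vs colour (X-ray detected)": (37, 82),
    "SF class 3, upper limit (X-ray detected only)": (17, 33),
    "AGN classes 1-2, upper limit (X-ray detected only)": (20, 82),
    "SF class 3, lower limit (whole sample)": (17, 242),
    "AGN classes 1-2, lower limit (whole sample)": (20, 398),
}

for label, (k, n) in fractions.items():
    print(f"{label}: {k}/{n} = {percent(k, n):.1f}%")
```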

The fractions reported by S08 (30% of MCC in the SF and
20% in the AGN sample; see Sect. 2) are intermediate between
our upper and lower limit estimates, and therefore we regard
them as in agreement with our findings. In the following, we use
the method described in Sect. 4.5 to identify the MCC candidates.
The same method cannot be applied to X-ray undetected
objects, and therefore the optical colour-based classification is
used for them in the remainder of this paper.

7. The X-ray/radio flux ratio 

Correlations between X-ray and radio luminosities of star forming
galaxies, and between the X-ray and far infrared (FIR) ones,
are well established for the local universe and have been tested
for objects up to z ~ 1 (Bauer et al. 2002; RCS03; Gilfanov et al.
2004a; Persic & Rephaeli 2007). Both the radio and FIR luminosities
are tracers of the star formation rate (SFR). These correlations
are linear, and imply that in the absence of any contribution
from an AGN, the X-ray emission is powered by star-formation
related processes. High-Mass X-ray Binaries (HMXB) seem to
have the same luminosity function in all galaxies, only
normalised according to the actual SFR. Thus, if HMXB are the
dominant contributors to the X-ray emission, the X-ray luminosity
is a tracer of the SFR (Grimm et al. 2003; Gilfanov et al.
2004a; Persic & Rephaeli 2007). The other possibly dominant
contributors to the X-ray emission are the Low-Mass X-ray
Binaries (LMXB), whose number scales with the galaxy stellar
mass. The actual balance of the two populations thus depends
on the ratio between SFR and mass; it is therefore possible that
the correlation evolves somewhat because of the build-up of mass
in galaxies over cosmic time.

The VLA- and C-COSMOS surveys contain a sizable number
of objects at medium-deep redshifts on which the correlation
might be tested. However, many objects have radio fluxes close
to the VLA-COSMOS flux limit. This is clearly visible in Fig. 5,
where we show the radio and X-ray fluxes and upper limits for
all the SF galaxies in the C-COSMOS field. Since we account
for the X-ray upper limits, the main source of potential bias is
the radio flux limit; the results may therefore be biased towards
objects that are brighter in the radio than in X-rays. To partially
mitigate this effect, we split the sample into two redshift bins,
defined as follows.

The knee of the radio LF of galaxies is found at a luminosity
$L_k \sim 1.5 \times 10^{29}$ erg s$^{-1}$ Hz$^{-1}$ at 1.4 GHz (see,
e.g., Fig. 2 in RCS05). We define the first bin as the redshift
interval in which all objects with luminosity $> L_k$ can be
observed. This corresponds to z < 0.2. In other words, this redshift
threshold guarantees that the luminosities around $L_k$ are included
in the sampled volume. While this bin does not contain a strictly
volume-limited subsample, here the selection bias should be
mitigated as much as possible, and still the bin includes a sizable
number of objects.

The second bin contains the remaining objects, i.e. those 
with z > 0.2. Finally, we refer to calculations made on all ob- 
jects as a third bin named "any-z". 

For most of the objects, the 0.5-2 keV flux limit is a few
times larger than what is expected from the radio flux limits and
the radio/X-ray correlation. Many objects therefore only have
X-ray upper limits, which need to be properly accounted for. The
2-10 keV limit is about one order of magnitude larger and thus
this band is not considered in this Section.

The hypothesis we would like to test is whether the COSMOS data
are consistent with the extrapolation of the RCS03 correlation,
or whether they require substantially different parameters. This
kind of question is usually what Bayesian methods are most suited
to answer. The ratio

$$ q = \mathrm{Log}\left(F_{0.5-2\,\mathrm{keV}} / S_{1.4\,\mathrm{GHz}}\right) \qquad (2) $$

may be defined in the same way as the analogous q parameter often
employed in testing the radio/FIR correlation of spiral galaxies
(Condon 1992). Non-detections in X-rays therefore lead to
upper limits for the q's, and can be properly accounted for.

To include the K-correction in the q's, we consider the individual
redshifts of the sources and assume average power-law
spectra with average slopes. After using X-rays (0.5-2.0 keV
luminosity less than $10^{42}$ erg s$^{-1}$) to remove bright AGN from
the VLA-CDFS survey (Kellermann et al. 2008; Tozzi et al. 2009),
a radio spectral energy index $\alpha \sim 0.69$ can be assumed for the
galaxies.
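
A K-corrected q can be computed as in the following Python sketch. Several ingredients are assumptions made here and not statements of the original procedure: the radio flux density is converted to erg s^-1 cm^-2 Hz^-1 so that q is of the order of the RCS03 value; the standard power-law K-corrections are used, (1+z)^(Gamma-2) for a band flux with photon index Gamma and (1+z)^(alpha-1) for a flux density with energy index alpha; and Gamma = 2.1 is adopted as the average X-ray slope. The function name q_ratio is hypothetical.

```python
import math

MJY = 1e-26  # 1 mJy in erg s^-1 cm^-2 Hz^-1


def q_ratio(f_x, s_radio_mjy, z=0.0, gamma=2.1, alpha=0.69):
    """Rest-frame q = Log(F_X / S_radio) for power-law spectra.

    f_x: observed 0.5-2 keV flux in erg s^-1 cm^-2 (a flux upper limit
    gives an upper limit on q); s_radio_mjy: 1.4 GHz flux density in mJy.
    """
    q_obs = math.log10(f_x / (s_radio_mjy * MJY))
    # Assumed K-corrections: band flux (gamma - 2), flux density (alpha - 1).
    k_term = (gamma - 2.0) - (alpha - 1.0)
    return q_obs + k_term * math.log10(1.0 + z)


# Toy source lying exactly on a q = 11.10 relation at z = 0.
s_mjy = 0.1
f_x = 10 ** 11.10 * s_mjy * MJY
print(q_ratio(f_x, s_mjy), q_ratio(f_x, s_mjy, z=1.0))
```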

The radio/X-ray correlation is described by its slope (here
assumed unity), its mean q, and its standard deviation $\sigma$. The two
latter parameters are the subject of the present statistical
analysis, and need prior probability distributions, which we take as
follows:

- $q_{\rm prior}$ is assumed to follow a normal distribution with mean
$q_0$ and standard deviation $\sigma$ taken equal to the values found
by RCS03 in the local universe (q = 11.10 and $\sigma$ = 0.24);

- the standard deviation $\sigma_{\rm prior}$ is assumed to be uniformly
distributed.

In Fig. 6 the prior distribution is represented as the grey shade.

The Bayesian posterior probabilities were calculated with
the Monte Carlo method (Meeker & Escobar 1998); the credible
contours$^3$ for the mean and standard deviation of q are also
shown in Fig. 6 along with the RCS03 estimate.
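
The structure of such a posterior can be illustrated with a small Python sketch. Note that it is not the Monte Carlo procedure used here: the posterior is simply evaluated on a grid. It also assumes a Gaussian likelihood in which detections contribute the probability density and X-ray upper limits contribute the cumulative probability of q lying below the limit (the usual treatment of censored data, adopted here as an assumption). The data values and function names are made up for illustration.

```python
import math


def log_norm_pdf(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def log_posterior(mu, sigma, q_detected, q_upper, q0=11.10, sigma0=0.24):
    """Unnormalised log posterior for the mean and scatter of q."""
    # Prior: normal on the mean (RCS03 values), flat on sigma.
    logp = log_norm_pdf(mu, q0, sigma0)
    for q in q_detected:
        logp += log_norm_pdf(q, mu, sigma)
    for q in q_upper:
        # Upper limit: probability that the true q lies below the limit.
        logp += math.log(max(norm_cdf(q, mu, sigma), 1e-300))
    return logp


def posterior_grid(q_detected, q_upper, mus, sigmas):
    """Normalised posterior on a grid, as a dict keyed by (mu, sigma)."""
    logs = {(m, s): log_posterior(m, s, q_detected, q_upper)
            for m in mus for s in sigmas}
    peak = max(logs.values())
    weights = {key: math.exp(val - peak) for key, val in logs.items()}
    norm = sum(weights.values())
    return {key: w / norm for key, w in weights.items()}


# Made-up q values: four detections and two upper limits.
q_detected = [11.0, 11.2, 10.9, 11.3]
q_upper = [11.5, 11.4]
mus = [10.5 + 0.05 * i for i in range(21)]
sigmas = [0.05 + 0.05 * i for i in range(1, 12)]
post = posterior_grid(q_detected, q_upper, mus, sigmas)
best_mu, best_sigma = max(post, key=post.get)
print(f"posterior peak: mean q = {best_mu:.2f}, sigma = {best_sigma:.2f}")
```

Credible regions such as the HPD areas can then be obtained by sorting the grid cells by posterior value and accumulating them until the desired probability is enclosed.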

The posterior distribution for q and $\sigma$ is consistent with the
RCS03 values, within the 68.3% Highest Posterior Density (HPD)
area for the z < 0.2 and any-z bins, and within the 90% HPD
area for the z > 0.2 bin. However, the centres of the posterior
distributions of the z > 0.2 and the any-z bins appear to be shifted
to lower values of q.

The z > 0.2 and the any-z bins allow a larger $\sigma$ than RCS03
(by a factor < 2), while the z < 0.2 bin also allows smaller $\sigma$. The
reasons for the larger dispersion may include the large number
of upper limits, uncertainties on the K-correction, and a residual
contamination by AGN of the objects with X-ray upper limits.

3 It may be worth recalling that in the Bayesian framework one
speaks of credible contours and intervals, leaving the word confidence
for frequentist statistics in order to avoid confusion.

Fig. 6. Credible contours for the logarithmic X-ray/radio ratio
q. The Bayes formula was used to derive the posterior probability
for q, shown as contours (levels relative to 50%, 68.3% and
90%). The blue continuous and the green long-dashed contours
refer to SF objects with z < 0.2 and z > 0.2, respectively. The
red short-dashed contours refer to SF objects at any redshift (i.e.,
the sum of the two bins). The green long-dashed arrow shows
the correction that should be made to q for the z > 0.2 and any-z
bins to account for the bias due to radio selection. The grey-scale
background shows the prior (a darker grey means a higher
probability density). The yellow asterisk shows the $q_0$ value from
RCS03.

One possible explanation of a smaller q at high redshift is
that the radio luminosity is evolving in redshift at a faster pace
than the X-ray one. The possibilities for an increased efficiency
of the radio emission might include different details of cosmic
ray acceleration and propagation, or larger magnetic fields at
high redshift. However, it has been shown that magnetic fields
of a strength similar to those found in local galaxies are in place
at z ~ 1 (Bernet et al. 2008).

The two redshift bins can also be seen, approximately, as two
luminosity bins. Thus, a different interpretation may be that q is
luminosity dependent, in line with the suggestion by Symeonidis
et al. (2011) that the X-ray/far-infrared ratio may be lower in
UltraLuminous InfraRed Galaxies (ULIRGs) than in normal galaxies
with lower luminosities. If the analysis presented in this Section
is repeated by dividing the sample into two luminosity bins (radio
luminosities lower and higher than $4 \times 10^{29}$ erg s$^{-1}$
Hz$^{-1}$, which is the median luminosity of the SF sample), then a
picture quite similar to Fig. 6 is obtained: the high-luminosity bin
favours a smaller q than the low-luminosity bin, yet there is a
sizable fraction of the parameter space which is allowed by both
bins, and which contains the RCS03 value.

However, the redshift bins whose data suggest the evolution
only sample the high-luminosity tail of the radio luminosity
function, and it is possible that a selection bias is part
of the explanation: it may be that radio-luminous objects have
lower X-ray luminosity than average. In fact, it has been shown
(Vattakunnel et al. 2012) that galaxies in the Chandra Deep
Fields, whose average X-ray luminosity is about one order of
magnitude lower than those presented here, follow the same
X-ray/radio correlation as galaxies in the local universe.



An estimate of the bias due to the radio selection may be
made with the method employed by Sargent et al. (2010) for the
infrared/radio correlation (see also Kellermann 1964; Condon
1984; Francis 1993; Lauer et al. 2007), in which the bias for a
flux limited survey (where the luminosity function of the sources
is not fully sampled) depends only on the scatter of the correlation
$\sigma$ and on the slope of the differential number counts $\beta$:

$$ \Delta q_{\rm bias} = \ln(10)\,(\beta - 1)\,\sigma^2 \simeq 0.18 \qquad (3) $$

where $\beta \sim 2.35$ (see Sect. 8) and $\sigma = 0.24$. The amount of
correction is shown in Fig. 6 as the green dashed arrow. Applying this
correction to the z > 0.2 and any-z bins would mostly remove the
redshift (or luminosity) evolution of q. This correction need not
be applied to the z < 0.2 bin because here the luminosity function
should be almost correctly sampled. However, this correction
should be only taken as a first-order approximation, because
one of its hypotheses is that the scatter of the X-ray/radio
correlation is described by a Gaussian function. Because
$\Delta q_{\rm bias}$ is sensitive to the shape of the wings of the
function, any deviation of the correlation from a Gaussian would
require Eq. (3) to be modified accordingly (Lauer et al. 2007).
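
The size of the correction can be checked numerically. The sketch below assumes that Eq. (3) has the form ln(10) (beta - 1) sigma^2, i.e. the expression used by Sargent et al. (2010) for a Gaussian scatter; the function name delta_q_bias is hypothetical.

```python
import math


def delta_q_bias(beta, sigma):
    """Selection bias on q for a flux-limited survey, assuming Gaussian scatter.

    beta: slope of the differential number counts; sigma: scatter of the
    correlation in dex.
    """
    return math.log(10.0) * (beta - 1.0) * sigma ** 2


print(f"{delta_q_bias(2.35, 0.24):.2f}")  # prints 0.18
```

Since the bias scales with the square of the scatter, a scatter twice as large as the RCS03 value would increase the correction by a factor of four.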

For these reasons, a further investigation of this issue with
deeper observations in both the radio and X-ray bands, and
including a proper statistical treatment of truncated data$^4$, could be
an interesting subject for a follow-up analysis.

8. Demographics of star forming galaxies 

An important test for the selection criteria described so far is
to check for the size of the population of candidate SF X-ray
galaxies. In this Section, we describe four different alternatives,
which are plotted in Fig. 7. The galaxy X-ray number counts
from the Chandra Deep Fields (CDFS; Xue et al. 2011) are also
shown for reference.

First, we consider the 22 class 1 objects from Tables 2 and
3. This is the strictest selection that we discuss, since it requires
that an object is radio-detected, and that all the X-ray based
criteria are fulfilled. It should thus be considered as a lower limit.
We keep the assumption that a criterion is considered fulfilled if
the relevant data are lacking, as done in Sect. 4.5. The number
counts for this selection are plotted in Fig. 7 as the dotted red
histogram.

The addition of the 14 class 2 objects to the above selection
gives the solid red histogram, with the errors shown as the grey
area. We only plot the errors for this selection, in order not to
clutter the figure; however, they can be taken as representative
of the other alternatives. The ratio between the class-1-only and
the class $\le$ 2 histograms is about a factor of 2 at high X-ray fluxes
and less than that at lower fluxes.

A different approach is to discard the radio selection criterion,
and to apply the X-ray based criteria to the whole C-COSMOS
catalogue (Elvis et al. 2009; Civano et al. 2012).
This opens the possibility to include a larger number of composite
SF/AGN and of low luminosity AGN in the selection. A
total of 63 class-1 and 192 class-2 objects are selected in this
way. The dotted and solid black histograms show the result for
the class-1 and class $\le$ 2 objects, respectively. While the class-1
histogram lies a factor of ~ 2-3 above the red ones and is still
within the 1-2$\sigma$ errors for the above determinations, a much
larger difference (a factor of ~ 7) is observed for the class $\le$ 2
histogram.

4 In statistical nomenclature, upper limits to the fluxes of objects
otherwise detected at a different wavelength are an example of censored
data, while a flux-limited survey for which no information is available
about the existence of objects at fluxes lower than the limit is an
example of truncated data. Truncated data cannot be treated with the
same tools valid for censored data (see, e.g., Lawless 2003).

Fig. 7. Number counts of SF sources detected in the VLA- and
C-COSMOS surveys, compared to the CDFS determinations.
Dotted red histogram: VLA+C-COSMOS class-1 SF candidates.
Solid red histogram, with grey error area: VLA+C-COSMOS
class-1,2 SF candidates. Dotted black histogram: C-COSMOS
class-1 SF candidates (not requiring radio detection). Solid
black histogram: C-COSMOS class-1,2 SF candidates (not
requiring radio detection). Green histogram: galaxy candidates
in the CDFS (Xue et al. 2011); their selection criteria are similar
to ours for the dotted red histogram. The thick yellow line and
the horn-shaped symbol refer to the total (AGN+SF) number
counts and fluctuations in the Chandra Deep Fields (Moretti et al.
2003; Miyaji & Griffiths 2002a,b).

The latter Log N-Log S should be regarded as a likely overestimate
of the X-ray galaxy number counts. In fact, if this histogram
were extrapolated to fainter fluxes, it would predict a
number of galaxies which, summed with the expected number of AGN
from synthesis models (Gilli et al. 2007; Treister et al. 2009),
would be incompatible with the observed total Log N-Log S.
A further hint comes from the integration of the galaxy X-ray
luminosity function: the theoretical Log N-Log S relations in
RCS05 would lie on average a factor of 3 below the histogram.
(The same Log N-Log S are however consistent with the other
three determinations).

The galaxy number counts were derived by Xue et al. (2011)
for the 4 Ms CDFS by considering the following criteria: X-ray
luminosity, photon index, X/O, optical spectroscopic classification,
and X-ray/radio ratio. The criteria were combined in a manner
similar to what was done here for the class-1 sources. It is thus not
surprising that, although each threshold is somewhat different
from that used in this paper, the final result is very similar to
the counts of radio-selected galaxies presented here. A power law
of the form Log N(> S) = -1.35 Log S - 19.15 may thus
be considered a useful description of both the Xue et al. (2011) and
our determinations of the galaxy X-ray number counts.
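
The power-law description can be evaluated with a few lines of Python. The units are not restated in this passage, so the sketch assumes S in erg s^-1 cm^-2 and N in sources per square degree, as is customary for X-ray number counts; the function names are hypothetical.

```python
import math


def log_n_cumulative(flux):
    """Log N(>S) from the power-law description: -1.35 Log S - 19.15."""
    return -1.35 * math.log10(flux) - 19.15


def n_cumulative(flux):
    """Cumulative number counts N(>S) (assumed to be per square degree)."""
    return 10.0 ** log_n_cumulative(flux)


for flux in (1e-17, 1e-16, 1e-15):
    print(f"S = {flux:.0e}: Log N(>S) = {log_n_cumulative(flux):.2f}")
```

A cumulative slope of 1.35 corresponds to a differential slope of 2.35, which is the value of beta adopted for the bias estimate in Sect. 7.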



9. Conclusions 

We have presented the X-ray properties of a sample of 242 SF
galaxies in the C-COSMOS field, selected in the radio band and
classified according to the optical colours with the method
described in S08. This method builds on the definition of a
synthetic rest-frame colour, which can be calculated from narrow-band
photometry in several bands, and which has been shown
to correlate with the position in the BPT diagram (Smolčić et al.
2006). It has a similar power to the BPT diagram, with the
advantage of not requiring expensive spectral observations.

In Chandra observations, 33 objects were detected. A com- 
parison sample of 398 candidate type-II AGN (with 82 detec- 
tions) is also presented. 

We have reviewed some X-ray based selection criteria com- 
monly used in the literature, and analyzed how they affect the 
composition of the SF and AGN samples. We have thus refined 
the SF sample, and recovered some objects from the AGN one, 
on the basis of the following parameters: 

- hardness ratio; 

- X-ray luminosity; 

- X-ray/optical flux ratio; 

- classification from optical spectroscopy. 

This is a small yet effective set of indicators based only on
X-ray and optical properties. We also mention that a similar method
has been applied by Xue et al. (2011) in the Chandra Deep Field
South.

We have proposed two refined subsamples of C-COSMOS
SF galaxies, based on the absence of AGN-like properties, one
(class-1) being more strict in its criteria than the other (class $\le$ 2),
containing 8 and 16 objects respectively. If the same method is
applied to the AGN sample, 14 objects may be recovered as SF
under the stricter method, and 20 under the more liberal one;
these objects may be composite, or may have been misclassified
by the optical colour method.

Of 33 detections in the SF sample, 17 exhibit AGN-like 
properties in terms of hardness ratio, non-detection in the 0.5- 
2.0 keV band, optical spectrum, X-ray/optical flux ratio and ab- 
solute X-ray luminosity. Among 82 detections in the AGN sam- 
ple, 20 have SF-like properties. Thus the fraction of compos- 
ite/misclassified objects is 50% for the SF sample and 25% for 
the AGN, while S08 reported fractions of 30% and 20%, respec- 
tively. The larger fractions observed here are likely to be ex- 
plained as a selection effect, due to the fact that AGN are on 
average brighter than galaxies in the X-rays. 

Conversely, the stacked spectra of X-ray undetected SF and
AGN are significantly different: the SF one can be fit with power-law
spectra with $1.5 < \Gamma < 2.5$, while the AGN one is flatter
($0.4 < \Gamma < 0.9$). This suggests that the two samples do have
different physical properties and that the fractions of
composite/misclassified objects are actually lower for X-ray-undetected
objects. Thus we regard the fraction of mis-classified and/or composite
objects to be in line with the expectations from S08.

We have investigated whether the radio/X-ray luminosity
correlation (RCS03) applies to our data, and whether there is any
evidence for redshift evolution of the correlation parameters. A
subsample of SF objects at z < 0.2 yields an X-ray/radio ratio fully
consistent with the local RCS03 estimate. Data at larger redshifts
are still consistent with the local value. Some evolution towards lower
X-ray/radio ratios is possible, but at least part of the evolution
may be explained by selection biases arising in flux-limited
surveys. Further analysis of deeper data, or the use of statistical
techniques appropriate to truncated data, may be necessary.

We have presented the number counts of the C-COSMOS SF
galaxies according to different selection criteria, and compared
them to the number counts of the CDFS galaxies (Xue et al.
2011). Considering only the radio-selected class-1, or the
radio-selected class $\le$ 2, or dropping the radio selection and only
considering class-1, all lead to estimates which are consistent with
each other within the 1-2$\sigma$ errors. Dropping the radio selection
and considering class $\le$ 2 objects gives an overestimate of the
galaxy number counts.

Further observations of the COSMOS field with Chandra
would allow a much better determination of the X-ray
demographics of the SF galaxies at the redshifts probed in this paper.
By extending the coverage to the full 2 deg$^2$, with a uniform
exposure of 180 ks over 1.7 deg$^2$ (the HST-observed area), the final
size of the X-ray detected COSMOS SF galaxy sample should
be of the order of 200.

The AEGIS survey (Nandra et al. 2005) is similar in methodology
to COSMOS, and currently has an observed area and
flux limit similar to C-COSMOS; recent observations have
added deeper coverage (uniform exposure of 800 ks) to a 0.6
deg$^2$ sub-field. Even deeper is the exposure (4 Ms) of the
Chandra Deep Field South (CDFS) in the GOODS survey.
Though the probed area is smaller (0.2 deg$^2$; Xue et al. 2011;
Vattakunnel et al. 2012), the CDFS already provides 179 objects
classified as galaxies (24% of the total). This latter field also has
a 3 Ms coverage with XMM-Newton, which is providing good
quality spectroscopy (Comastri et al. 2011).



The surveys described above also have extensive optical
spectroscopy, and have been observed in the far infrared by
Spitzer and Herschel. The inclusion of infrared data could provide
a further improvement in the object classification, and together
with optical photometry could make it possible to separate the
AGN and host galaxy contributions for the composite objects.

Finally, these data sets and the classifications done so far
could be used as testbeds for innovative statistical methods in
object recognition and classification. This would be especially
useful in light of the future large surveys, both in X-rays (e.g.,
eROSITA) and in optical (LSST, Pan-STARRS, SNAP).

Table 2. Catalogue of radio-selected SF-candidate sources with an X-ray detection in C-COSMOS. The columns report: VLA and Chandra names, Chandra ID from Elvis et al. (2009), redshifts,
X-ray fluxes (in erg s$^{-1}$ cm$^{-2}$) and luminosities (rest-frame; in erg s$^{-1}$) for the soft (0.5-2.0 keV) and hard (2.0-10 keV) bands; rest-frame hardness ratios; X-ray/optical flux ratios; X-ray/radio flux
ratio q; classification from optical spectroscopy (A: absorption line galaxy; E: emission line galaxy; L: spectrum with low signal/noise ratio; T1: type-I AGN; T2: type-II AGN); final classification
(see Sect. 4.5).

Table 3. Catalogue of radio-selected AGN-candidate sources with an X-ray detection in C-COSMOS. Columns and footnotes as in Table 2.